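Modeling and control of complex physical dynamics are essential in real-world problems. We propose a novel framework that is generally applicable to PDE-constrained optimal control problems; it introduces surrogate models for PDE solution operators with special regularizers. The procedure is divided into two phases: learning the solution operator for the PDE constraints (Phase 1) and searching for the optimal control (Phase 2). Once the surrogate model is trained in Phase 1, the optimal control can be inferred in Phase 2 without intensive computation. Our framework can be applied to both data-driven and data-free cases. We demonstrate the successful application of our method to various optimal control problems for different control variables, with PDE constraints ranging from the Poisson equation to the Burgers equation.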
translated by Google Translate
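The two-phase procedure in the abstract above can be illustrated with a deliberately tiny sketch. Everything here is an assumption made for illustration: the "PDE solution operator" is replaced by an unknown scalar map from control `u` to state `y`, the surrogate is a one-parameter linear model rather than a neural operator, and the names `fit_surrogate` and `search_control` are hypothetical. It is not the authors' method, only the shape of "train a surrogate once, then optimize the control through it".

```python
def fit_surrogate(controls, states):
    """Phase 1 (toy): least-squares fit of a linear surrogate y ~= a * u."""
    numerator = sum(u * y for u, y in zip(controls, states))
    denominator = sum(u * u for u in controls)
    return numerator / denominator


def search_control(a, target, weight=0.0, lr=0.05, steps=500):
    """Phase 2 (toy): gradient descent on J(u) = (a*u - target)^2 + weight*u^2.

    Only the cheap surrogate `a * u` is evaluated here; the original
    solver is never called again.
    """
    u = 0.0
    for _ in range(steps):
        grad = 2.0 * a * (a * u - target) + 2.0 * weight * u
        u -= lr * grad
    return u
```

With sample pairs generated by a hidden map such as `y = 2u`, `fit_surrogate` recovers the slope, and `search_control` then drives the surrogate output toward the requested target state.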
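Active inference may be defined as Bayesian modeling of a brain with a biologically plausible model of the agent. Its primary idea relies on the free energy principle and the prior preference of the agent: an agent chooses an action that leads to its prior preference for a future observation. In this paper, we claim that active inference can be interpreted using reinforcement learning (RL) algorithms, and we find a theoretical connection between them. We extend the concept of expected free energy (EFE), a core quantity in active inference, and claim that EFE can be treated as a negative value function. Motivated by the concept of prior preference and this theoretical connection, we propose a simple but novel method for learning a prior preference from experts. This illustrates that the inverse RL problem can be approached from the new perspective of active inference. Experimental results on prior preference learning show the feasibility of EFE-based rewards and their application to an inverse RL problem.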
Cone-beam computed tomography (CBCT) provides 3D volumetric imaging of a target at a lower radiation dose and cost than conventional computed tomography, and it is widely used in the detection of paranasal sinus disease. However, it lacks the sensitivity to detect soft tissue lesions owing to reconstruction constraints. Consequently, only physicians with expertise in CBCT reading can distinguish inherent artifacts or noise from disease, which restricts the use of this imaging modality. The development of artificial intelligence (AI)-based computer-aided diagnosis methods for CBCT to overcome the shortage of experienced physicians has attracted substantial attention. However, advanced AI-based diagnosis that addresses the intrinsic noise in CBCT has not been devised, which discourages the practical use of AI solutions for CBCT. To address this issue, we propose an AI-based computer-aided diagnosis method using CBCT with a denoising module. This module is applied before diagnosis to reconstruct the internal ground-truth full-dose scan corresponding to an input CBCT image and thereby improve diagnostic performance. The external validation results for the unified diagnosis of sinus fungal ball, chronic rhinosinusitis, and normal cases show that the proposed method improves the micro-average AUC, macro-average AUC, and accuracy by 7.4, 5.6, and 9.6 percentage points (from 86.2, 87.0, and 73.4% to 93.6, 92.6, and 83.0%), respectively, compared with a baseline, while improving human diagnosis accuracy by about 11 percentage points (from 71.7 to 83.0%), demonstrating technical differentiation and clinical effectiveness. This pioneering study on AI-based diagnosis using CBCT indicates that denoising can improve diagnostic performance and reader interpretability for images of the sinonasal area, thereby providing a new approach and direction for radiographic image reconstruction in the development of AI-based diagnostic solutions.
Image super-resolution is a common task on mobile and IoT devices, where one often needs to upscale and enhance low-resolution images and video frames. While numerous solutions have been proposed for this problem in the past, they are usually not compatible with low-power mobile NPUs, which have many computational and memory constraints. In this Mobile AI challenge, we address this problem and ask the participants to design an efficient quantized image super-resolution solution that can demonstrate real-time performance on mobile NPUs. The participants were provided with the DIV2K dataset and trained INT8 models to perform high-quality 3X image upscaling. The runtime of all models was evaluated on the Synaptics VS680 Smart Home board with a dedicated edge NPU capable of accelerating quantized neural networks. All proposed solutions are fully compatible with this NPU and reach a rate of up to 60 FPS when reconstructing Full HD resolution images. A detailed description of all models developed in the challenge is provided in this paper.
To ensure vehicle safety, the impact performance of a wheel must be verified through a wheel impact test during wheel development. However, manufacturing and testing a real wheel requires significant time and money, because developing an optimal wheel design requires numerous iterations of modifying the design and verifying its safety performance. Accordingly, wheel impact tests have been replaced by computer simulations such as finite element analysis (FEA); however, FEA still incurs high computational costs for modeling and analysis and requires FEA experts. In this study, we present a deep-learning-based model for predicting the impact performance of aluminum road wheels that replaces computationally expensive and time-consuming 3D FEA. For this purpose, 2D disk-view wheel image data, 3D wheel voxel data, and the barrier mass values used for the wheel impact test were used as inputs to predict the magnitude of the maximum von Mises stress, its location, and the stress distribution of the 2D disk-view. The input data were first compressed into a latent space with a 3D convolutional variational autoencoder (cVAE) and a 2D convolutional autoencoder (cAE). Subsequently, fully connected layers were used to predict the impact performance, and a decoder was used to predict the stress distribution heatmap of the 2D disk-view. The proposed model can replace the impact test in the early wheel-development stage by predicting the impact performance in real time, and it can be used without domain knowledge. Using this model can reduce the time required for the wheel development process.
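This paper presents the persistent Weisfeiler-Lehman random walk scheme (abbreviated as PWLR) for graph representations, a novel mathematical framework that produces explainable low-dimensional representations of graphs with discrete and continuous node features. The proposed scheme effectively combines the normalized Weisfeiler-Lehman procedure, random walks on graphs, and persistent homology. We thereby integrate three distinct properties of graphs, namely local topological features, node degrees, and global topological invariants, while preserving stability under graph perturbations. This generalizes many variants of the Weisfeiler-Lehman procedure, which are primarily used to embed graphs with discrete node labels. Empirical results suggest that these representations can be used efficiently to produce results comparable to state-of-the-art techniques in classifying graphs with discrete node labels, and enhanced performance in classifying graphs with continuous node features.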
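For readers unfamiliar with the Weisfeiler-Lehman procedure that PWLR builds on, the following is a minimal sketch of the classical relabeling step for graphs with discrete integer node labels. It is not the normalized or persistent variant described in the abstract; the function name `wl_refine` and the dict-based graph encoding are assumptions chosen for illustration.

```python
def wl_refine(adjacency, labels, iterations=2):
    """Classical 1-dimensional Weisfeiler-Lehman relabeling.

    adjacency: dict mapping each node to a list of its neighbours
    labels:    dict mapping each node to an initial integer label
    Returns a list of label dicts: the initial labelling followed by
    one refined labelling per iteration.
    """
    current = dict(labels)
    history = [dict(current)]
    for _ in range(iterations):
        # A node's signature is its own label plus the sorted
        # multiset of its neighbours' labels.
        signatures = {
            node: (current[node],
                   tuple(sorted(current[n] for n in adjacency[node])))
            for node in adjacency
        }
        # Compress each distinct signature into a small integer label.
        palette = {sig: i
                   for i, sig in enumerate(sorted(set(signatures.values())))}
        current = {node: palette[signatures[node]] for node in adjacency}
        history.append(dict(current))
    return history
```

On a three-node path with identical initial labels, one iteration already separates the middle node from the two end nodes, because their neighbourhoods differ. Note that the compressed labels here are local to a single graph; comparing several graphs would need a palette shared across them.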
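Recent deep learning models have achieved high performance in speech enhancement. However, obtaining a fast, low-complexity model without significant performance degradation remains a challenge. Previous knowledge distillation studies on speech enhancement could not solve this problem because their output distillation methods do not fit the speech enhancement task in some respects. In this study, we propose multi-view attention transfer (MV-AT), a feature-based distillation method, to obtain efficient speech enhancement models in the time domain. Based on a multi-view feature extraction model, MV-AT transfers the multi-view knowledge of the teacher network to the student network without additional parameters. The experimental results show that the proposed method consistently improves the performance of student models of various sizes on the Valentini and deep noise suppression (DNS) datasets. Many-S-8.1GF with our proposed method, a lightweight model for efficient deployment, requires 15.4x fewer parameters and 4.71x fewer floating-point operations (FLOPs), respectively, than a baseline model with similar performance.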
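Denoising of magnetic resonance images is beneficial for improving the quality of low signal-to-noise ratio images. Recently, denoising using deep neural networks has shown promising results. However, most of these networks rely on supervised learning, which requires a large number of training pairs of noise-corrupted and clean images. Obtaining training images, particularly clean images, is expensive and time-consuming. Hence, methods such as Noise2Noise (N2N), which require only pairs of noise-corrupted images, have been developed to reduce the burden of obtaining training datasets. In this study, we propose a new self-supervised denoising method, Coil2Coil (C2C), that does not require the acquisition of clean images or paired noise-corrupted images for training. Instead, the method uses multichannel data from phased-array coils to generate training images. First, it divides the multichannel coil images into two images, one for the input and the other for the label. Then, they are processed to impose noise independence and sensitivity normalization so that they can be used as training images for N2N. For inference, the method takes a coil-combined image (e.g., a DICOM image) as input, which allows the method to be applied widely. When evaluated using images with synthetic noise added, C2C shows the best performance among several self-supervised methods and reports results comparable to supervised methods. When tested on DICOM images, C2C successfully removed real noise without showing structure-dependent residuals in the error maps. Because of the significant advantage of not requiring additional scans for clean or paired images, the method can be readily used in various clinical applications.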
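Multilingual neural machine translation has shown great success using transformer models. Deploying these models is challenging because they usually require large vocabulary (vocab) sizes to cover various languages. This limits the speed of predicting the output tokens in the last vocab projection layer. To alleviate these challenges, this paper proposes a fast vocabulary projection method via clustering that can be used for multilingual transformers on GPUs. First, we split the vocab search space offline into disjoint clusters given the hidden context vector of the decoder output, which results in much smaller vocab columns for the vocab projection. Second, at inference time, the proposed method predicts the cluster and the candidate active tokens for each hidden context vector at the vocab projection. This paper also includes an analysis of different ways of building these clusters in multilingual settings. Our results show end-to-end speed gains of up to 25% in float16 GPU inference while maintaining the BLEU score and slightly increasing memory cost. The proposed method speeds up the vocab projection step itself by up to 2.6x. We also conduct an extensive human evaluation to verify that the proposed method preserves the translation quality of the original model.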
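The two steps described in this abstract (predict a cluster, then score only that cluster's candidate tokens) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the abstract does not say how clusters are built or predicted, so the sketch assumes precomputed `centroids` and `cluster_tokens` and picks the cluster by largest dot product. All names are hypothetical, and plain Python lists stand in for GPU tensors.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as lists."""
    return sum(x * y for x, y in zip(a, b))


def full_projection(hidden, vocab_weights):
    """Baseline: score every vocabulary row and return the best token id."""
    scores = [dot(hidden, row) for row in vocab_weights]
    return max(range(len(scores)), key=scores.__getitem__)


def clustered_projection(hidden, vocab_weights, centroids, cluster_tokens):
    """Score only the candidate tokens of the cluster predicted for `hidden`.

    centroids:      one vector per cluster (assumed to be built offline)
    cluster_tokens: one list of candidate token ids per cluster
    """
    # Step 1: predict the cluster (assumption: largest centroid dot product).
    cluster = max(range(len(centroids)),
                  key=lambda c: dot(hidden, centroids[c]))
    # Step 2: project onto the much smaller set of candidate rows.
    candidates = cluster_tokens[cluster]
    return max(candidates, key=lambda t: dot(hidden, vocab_weights[t]))
```

The saving comes from step 2 touching only the rows listed for one cluster instead of the whole vocabulary. The result matches the full projection only when the true best token is among the predicted cluster's candidates, which is why the way clusters are constructed matters.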
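Online trolls increase social costs and cause psychological damage to individuals. With the proliferation of automated accounts that use bots for trolling, it is difficult for targeted individual users to handle the situation, both quantitatively and qualitatively. To address this issue, we focus on automating the countering of trolls, because counter responses to trolls encourage community users to keep the discussion going without compromising freedom of expression. To this end, we propose a novel dataset for automatic counter response generation. In particular, we construct a paired dataset that includes troll comments and counter responses with labeled response strategies, which enables models fine-tuned on our dataset to vary the counter response according to a specified strategy. We conduct three tasks to assess the effectiveness of the dataset and evaluate the results through both automatic and human evaluation. In the human evaluation, we demonstrate that a model fine-tuned on our dataset shows significantly improved performance in strategy-controlled sentence generation.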